title: An even more distributed ActivityPub
date: 2016-10-06 04:43
author: Christine Lemmer-Webber
tags: activitypub, decentralized, distributed, federation
slug: an-even-more-distributed-activitypub
---
So [ActivityPub](https://www.w3.org/TR/activitypub/) is nearing
Candidate Recommendation status. If you want to hear a lot more about
that whole process of getting there, and my recent trip to TPAC, and
more, [I wrote a post on the MediaGoblin blog about
it](http://mediagoblin.org/news/tpac-2016-and-review-activitypub.html).

Last night my brother Stephen came over and he was talking about how he
wished ActivityPub was more of a "transactional" system. I've been
thinking about this myself. ActivityPub as it is designed is made for
the social network of 2014 more or less: trying to reproduce what the
silos do, which is mutate a big database for specific objects, but
reproduce that in a distributed way. Well, mutating distributed systems
is a bit risky. Can we do better, without throwing out the majority of
the system? I think it's possible, with a couple of tweaks.

-   The summary is to move to objects and pointers to objects. There's
    no mutation, only "changing" pointers (and even this is done via
    appending to a log, mostly).

    If you're familiar with git, you could think of the objects as,
    well, objects, and the pointers as branches.

    Except... the history doesn't really live in the objects pointing
    back at their previous revisions; the log lives on the pointers:

        [pointer id] => [note content id]

-   There are (ActivityStreams) objects (which *may* be content-addressed,
    to be more robust), and then "pointers" to those, via signed
    pointer-logs.

-   The only mutation in the system happens in the "pointers", which
    are signed, append-only logs (substitute "ledger" for "log" and I
    guess that makes it a "blockchain", loosely) that say where the new
    content is. If something changes a lot, its log can have
    "checkpoints", so you can eventually ignore the old stuff.

-   Updating content means making a new object, and updating the
    pointer-log to point to it.

-   This of course leads to a problem: what identifier should objects
    use to point at each other? The "content" id, or the "pointer-log"
    id? One route is that when one object links to another object, it
    could link to both the pointer-log id and the object id, but that
    hardly seems desirable...

-   Maybe the best route is to have all content ids point back at their
    official log id... this isn't as crazy as it sounds! Use a
    three-step process for creating a brand new object:

    -   Open a new pointer-log, which is empty, and get the identifier
    -   Create the new object with all its content, and also add a link
        back to the pointer-log in the content's body
    -   Add the new object as the first item in the pointer-log

-   At this point, I think we can get rid of *all* side effects in
    ActivityPub! The only mutation is appending to that pointer-log.
    As for everything else:

    -   Create just means "This is the first time you've seen this
        object." And in fact, we could probably drop Create in a system
        like this, because we don't need it.
    -   Update is really just informing that there's a new entry on the
        pointer-log.
    -   Delete... well, you can delete your own copy. You're mostly
        asking other servers to delete their copies, and it's their
        choice whether they really do... though that's always been
        true! You also gain the nice property that removing old
        content is now really just garbage collection :)

-   Addressing and distribution still happen in the same ways they did
    before, I assume? So, you still won't get access to an object unless
    you have permissions? Though that gets more confusing if you use the
    (optional) content-addressed storage here.

-   You now get a whole lot of things for free:

    -   You have a built in history log of everything
    -   Even if someone else's node goes down, you can keep a copy of
        all their content, and keep around the signatures to show that
        yeah, that really was the content they put there!
    -   You could theoretically distribute storage pretty nicely
    -   Updates/deletes are less dangerous
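
To make the shape of this a bit more concrete, here's a minimal sketch in Python of the objects-plus-pointer-logs idea, including the three-step creation process. Everything named here is my own invention for illustration and not part of ActivityPub or ActivityStreams: the `PointerLog` class, the `pointerLog` property, the `urn:uuid:` log ids, and the `sha256:` content ids are all assumptions. The HMAC is also just a stand-in so the example runs with the standard library; a real system would want public-key signatures so other servers can verify the log.

```python
import hashlib
import hmac
import json
import uuid


def content_id(obj: dict) -> str:
    """Content-address an object by hashing its canonical JSON form."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def put(store: dict, obj: dict) -> str:
    """Store an immutable object under its content id and return that id."""
    cid = content_id(obj)
    store[cid] = obj
    return cid


class PointerLog:
    """Append-only log; the newest entry says where the current content is."""

    def __init__(self, key: bytes):
        self.id = "urn:uuid:" + str(uuid.uuid4())  # hypothetical id scheme
        self._key = key
        self.entries = []  # list of (content id, signature) pairs

    def _sign(self, index: int, cid: str) -> str:
        # Assumption: HMAC stands in for a real public-key signature.
        message = f"{self.id}|{index}|{cid}".encode("utf-8")
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def append(self, cid: str) -> None:
        self.entries.append((cid, self._sign(len(self.entries), cid)))

    def current(self):
        return self.entries[-1][0] if self.entries else None

    def verify(self) -> bool:
        """Check every entry's signature against its position and content id."""
        return all(
            hmac.compare_digest(sig, self._sign(i, cid))
            for i, (cid, sig) in enumerate(self.entries)
        )


def create(store: dict, key: bytes, content: dict) -> PointerLog:
    log = PointerLog(key)                   # 1. open an empty log, get its id
    obj = dict(content, pointerLog=log.id)  # 2. object links back to its log
    log.append(put(store, obj))             # 3. object becomes the first entry
    return log


def update(store: dict, log: PointerLog, content: dict) -> None:
    """An update is a brand new object plus one more entry on the log."""
    log.append(put(store, dict(content, pointerLog=log.id)))


store = {}
log = create(store, b"demo-key", {"type": "Note", "content": "Hello"})
update(store, log, {"type": "Note", "content": "Hello, world"})
```

After the update, the old revision is still sitting in `store` untouched, `log.current()` points at the new one, and `log.entries` is the history. Dropping revisions that no log entry you care about still references would be the garbage collection step.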

(Thanks to Steve for encouraging me to think this through more clearly,
and for lending your own thoughts, many of which are represented here!
Thanks also to Manu Sporny, who was the first to get me thinking along
these lines with some comments at TPAC. Though, any mistakes in the
design are mine...)

Of course, you can hit even more distributed-system-nerd points by
tossing in the possibility of encrypting everything in the system, but
let's leave that as an exercise for the reader. (It's not too much extra
work if you [already have public keys on
profiles](http://w3c.github.io/activitypub/#authorization-lds).)

Anyway, is this likely to happen? Well, time is running out in the
group, so I'm unlikely to push for it in this iteration. But the good
news, as I said, is that I think it can be built on top without too much
extra work... The systems might even be straight-up compatible, and
eventually the old mutation-heavy system could be considered the
"crufty" way of doing things.

Architectural astronaut'ing? Maybe! Fun to think about! Hopefully fun to
explore. Gotta get the 2014-made-distributed version of the social web
out first though. :)
